Using knowledge to organize sound: The prediction-driven approach to computational auditory scene analysis and its application to speech/nonspeech mixtures

نویسنده

  • Daniel P. W. Ellis
چکیده

Computational auditory scene analysis – modeling the human ability to organize sound mixtures according to their sources – has experienced a rapid evolution as the simple principles suggested by psychological experiments have turned out to be less than the whole story. Phenomena such as the continuity illusion and phonemic restoration show that the brain is able to use a wide range of knowledge-based contextual constraints when interpreting obscured or complex mixtures: To model such processing, we need architectures that operate by confirming hypotheses about the observations rather than relying on directly-extracted descriptions. One such architecture, the 'prediction-driven' approach, is presented along with results from its initial implementation. This architecture can be extended to take advantage of the high-level knowledge implicit in today's speech rec-ognizers by modifying a recognizer to act as one of the 'component models' which provide the explanations of the signal mixture. Although this adaptation raises a number of issues, a preliminary investigation supports the argument that successful scene analysis must exploit such abstract knowledge at every level.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Computational Auditory Scene Analysis Exploiting Speech-recognition Knowledge

The field of computational auditory scene analysis (CASA) strives to build computer models of the human ability to interpret sound mixtures as the combination of distinct sources. A major obstacle to this enterprise is defining and incorporating the kind of high level knowledge of real-world signal structure exploited by listeners. Speech recognition, while typically ignoring the problem of non...

متن کامل

Prediction-driven Computational Auditory Scene Analysis for Dense Sound Mixtures

We interpret the sound reaching our ears as the combined effect of independent, sound-producing entities in the external world; hearing would have limited usefulness if were defeated by overlapping sounds. Computer systems that are to interpret real-world sounds – for speech recognition or for multimedia indexing – must similarly interpret complex mixtures. However, existing functional models o...

متن کامل

Toward Automatic Sound Source Recognition: Identifying Musical Instruments

One of the broad goals of research in computational auditory scene analysis (CASA) is to create computer systems that can learn to recognize sound sources in a complex auditory environment. In this paper, a set of acoustic features is proposed that relate to the physical properties of sound-producing objects. In particular, a set of orchestral musical instrument sounds is presented as represent...

متن کامل

The auditory organization of speech and other sources in listeners and computational models

Speech is typically perceived against a background of other sounds. Listeners are adept at extracting target sources from the acoustic mixture reaching the ears. The auditory scene analysis account holds that this feat is the result of a two stage process. In the first stage, sound is decomposed both within and across auditory nuclei. Subsequent processes of perceptual organisation are informed...

متن کامل

Title : The auditory organization of speech and other sources in listeners and computational models

Speech is typically perceived against a background of other sounds. Listeners are adept at extracting target sources from the acoustic mixture reaching the ears. The auditory scene analysis account holds that this feat is the result of a two stage process: In the first stage sound is decomposed into collections of fragments in several dimensions. Subsequent processes of perceptual organization ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Speech Communication

دوره 27  شماره 

صفحات  -

تاریخ انتشار 1999